Abstract

We present an overview of the autonomous helicopter 
project at Carnegie Mellon’s Robotics Institute. The goal 
of this project is to autonomously fly helicopters using com- 
puter vision closely integrated with other on-board sen- 
sors. We discuss a concrete example mission designed to 
demonstrate the viability of vision-based helicopter flight 
and specify the components necessary to accomplish this 
mission. Major components include customized vision
processing hardware designed for high-bandwidth, low-latency
processing and a 6-degree-of-freedom test stand designed
for realistic and safe indoor experiments using model
helicopters. We describe our progress in accomplishing an
indoor mission and show experimental results of estimating 
helicopter state with computer vision during actual flight 
experiments. 

Introduction 

Precise maneuverability of helicopters makes them use- 
ful for many critical tasks including rescue and security 
operations, traffic monitoring, mountain fire fighting, 
and inspection of power transmission lines. The goal of 
our project is to build a vision-guided helicopter capable 
of performing these tasks while flying autonomously. In 
addition to robust helicopter control methods, the de- 
velopment of such a system requires research on vision 
algorithms for helicopter positioning and object recog- 
nition necessary for navigation and tracking tasks, to- 
gether with real-time hardware for high speed, robust 
execution of these tasks. 

An autonomous helicopter’s performance is critically 
dependent on accurate and frequent estimates of its po- 
sition and attitude. We focus on methods to provide 
these estimates using on-board cameras closely inte- 
grated with other sensors such as gyroscopes and ac- 
celerometers. 


We have demonstrated our first results on au- 
tonomous helicopter flight. We have built an indoor cal- 
ibrated testbed that allows free flight experiments with 
model helicopters. We have custom designed vision 
hardware which integrates data from on-board sensors 
with real-time image processing and can now achieve 
frame-rate (30 Hz) vision-based state estimation. Inte- 
grating this vision hardware into a stable control sys- 
tem will lead to outdoor autonomous helicopter flight 
for performing useful, practical missions. 

Motivation 

A helicopter is an indispensable air vehicle for emer- 
gency operations, such as rescuing stranded individuals 
and spraying fire extinguishing chemicals for fighting 
forest fires. Uses of helicopters in the electric power in- 
dustry include inspecting towers and transmission lines 
for corrosion and other defects. All of these applications
demand dangerous flight patterns in close proximity to
the ground or other objects, which puts the pilot at risk.
An unmanned helicopter that operates autonomously
or is piloted remotely will eliminate this risk and
increase the helicopter’s effectiveness.

Typical missions of autonomous helicopters require 
flying at low speeds to follow a path or hovering near 
an object. Positioning equipment such as Inertial
Navigation Systems (INS) or Global Positioning Systems
(GPS) is well suited for long-range, low-precision
helicopter flight but falls short for very precise,
close-proximity flight. Maneuvering helicopters close to objects
requires accurate positioning in relation to the objects. 
Visual sensing is a rich source of data for this relative 
feedback. 

It is difficult, however, to recover helicopter position 
and attitude from vision alone. For instance, distin- 
guishing between rotation and translation in a sequence 
of images under perspective projection is extremely
difficult. On the other hand, the new generation of
lightweight gyroscopes and angular rate sensors on the
market provides reliable measurements of the camera’s
angular change over an image sequence. For this reason, we
concentrate on low-level, close integration of such sensors
with vision.


Related Work 

The study of the helicopter control problem is not new.
Overcoming the inherent instability of helicopters has
been the focus of a large body of research, including
detailed mathematical models (e.g., [10]) for control
and Kalman filtering of multiple sensor data for
state estimation (e.g., [3]). Controller design methods
range from linear quadratic (LQ) design to H∞
design [19] and predictive control [8]. For example, a
stable closed-loop control system has been formulated
by quadratic synthesis techniques for helicopter
autolanding [3].

Recently, incorporation of a pilot model has been attempted
based on quadratic optimal Cooperative Control
Synthesis [17]. This model is used for control augmentation
where the control system cooperates with the
pilot to increase aircraft performance. The sophisticated
pilot model developed in [7] attempts to describe
the human’s ability to look ahead, which is crucial to
precise low-altitude helicopter control. While it is difficult
to identify and verify these models, they provide
a valuable basis for an intelligent helicopter controller,
especially in designing low-level control loops. In this
project, we employ a set of low-level controllers which
have been designed by using a simplified helicopter
dynamics model.

Actual flight tests of helicopter controllers have also 
been done. Notable implemented systems include those 
at NASA Ames Research Center [17], NASA Lang- 
ley Research Center [3], and military aircraft manu- 
facturers [5]. Fuzzy controllers have been successfully 
employed for actual helicopter flight experiments. In 
Japan, Sugeno’s group at Tokyo Institute of Technol- 
ogy [14] has demonstrated fuzzy control of helicopters 
for crop dusting. 

The state feedback for the above helicopter con- 
trol experiments was primarily provided by on-board 
INS/GPS or ground-based beacon systems instead of 
on-board computer vision. Recently, promising results
have begun to appear in real-time vision-based processors,
visual servoing of robotic manipulators, and accurate
vision-based position estimation systems, some
of which are applicable to autonomous helicopter control
experiments.


The development of low cost special-purpose im- 
age correlation chips and new multi-processor architec- 
tures capable of high communication rates has made 
a great impact on image processing. Examples of vi- 
sion systems built from this kind of hardware include 
transputer-based image hardware for two-dimensional 
object tracking [4] and real-time tracking and depth 
map generation using correlation chips [9]. 

The high rate of image processing has made the inclusion
of visual feedback in servo loops practical. There
has been significant development in visual control of
manipulators carrying small cameras (the eye-in-hand
configuration). Researchers at Carnegie Mellon’s Robotics
Institute [12] demonstrated real-time visual tracking of
arbitrary 3D objects traveling at unknown 2D velocities
using a direct-drive manipulator arm. The Yale spatial
robot juggler [13] demonstrated transputer-based stereo
vision for locating juggling balls in real time. Real-time
tracking and interception of objects using a manipulator [11]
has also been demonstrated based on fusion of
visual feedback and acoustic sensing.

Control by vision requires position estimation relative
to desired objects and extraction of 3D scene
structure from a sequence of images. RAPiD and
DROID [6], developed by Roke Manor Research Limited,
are systems designed for performing such tasks
in unknown environments. RAPiD is a model-based 
tracker capable of extracting the position and orien- 
tation of known objects in the scene. DROID is a 
feature-based system which uses the structure-from- 
motion principle for extracting scene structure using 
image sequences. Real-time implementations of these 
systems have been demonstrated using dedicated hard- 
ware. 

Integrating efficient model-based and connectionist
techniques with powerful hardware architectures has
produced an array of autonomous land and air vehicles.
Significant advances in autonomous automobiles
have demonstrated vision-based control at highway
speeds. Most notable are Carnegie Mellon’s
Navlab [16] project and the work of Dickmanns at the
University of the Bundeswehr, Munich, within the European
PROMETHEUS project [2].

Dickmanns applies a 4D approach exploiting spatio- 
temporal models of objects in the world to autonomous 
land and air vehicle control [1]. He has demonstrated 
autonomous state estimation for an aircraft in landing 
approach using a video camera, inertial gyros and an 
air velocity meter. Vision-based state estimation is also 
pursued at NASA Ames Research Center [15] using par- 
allel implementation of multi-sensor range estimation 
for helicopter flight. 
Figure 1: Indoor Testbed

An electrical model helicopter is supported by six light-weight 
graphite rods. A frictionless air bearing couples each rod with 
two-degree-of-freedom joints mounted on poles secured to the 
ground. Ground truth helicopter position is calculated from joint 
angles measured by shaft encoders. 

Indoor Helicopter Testbed 

For practical, calibrated experimentation, we have de- 
signed and built an indoor helicopter testbed. It con- 
sists of an electrical model helicopter mounted on a 6- 
degree-of-freedom (6-DOF) test stand (see Figure 1). 
Using the testbed, we can test each critical component 
necessary for autonomous flight before attempting po- 
tentially dangerous outdoor free flight experiments. 

Model helicopters provide an inexpensive, safe, and 
logistically manageable way to experiment with heli- 
copter control. They are faithful reproductions of full 
size helicopters with respect to the crucial rotor controls 
and configurations. Control techniques developed for 
the model helicopters can be directly applied to larger 
scale helicopters. 

The helicopter in our testbed is attached to a fric- 
tionless 6-DOF stand as shown in Figure 1. The stand 
provides ground truth measurement of the helicopter 
position and attitude, and also works as a safety de- 
vice preventing crashes and out-of-control flight. The 
helicopter on the stand can fly freely in a cone-shaped 
volume six feet wide and five feet tall without major 
inertial variations from free flight. The helicopter is 
fastened to six fixed poles by six light-weight graphite
rods. Each graphite rod is free to move through a
frictionless air bearing mounted on a two-degree-of-freedom
joint. The joint angles are measured by shaft encoders
and used by the computer to calculate the helicopter’s
ground truth position and attitude for experiment
evaluation.

Figure 2: Testbed System Configuration

The computer system configuration, shown in Fig- 
ure 2, consists of a host computer, customized vision 
processor, a real-time processor, synchronization hard- 
ware, and interfacing equipment. A hand-held radio 
transmitter used by a model helicopter pilot is inter- 
faced to real-time computers. Using this interface, we 
can send computer control signals to the helicopter. 
The same interface can be used for free flying heli- 
copters. 

With this testbed, we can perform controlled exper- 
iments over a wide range of conditions. We can create 
various wind conditions by using fans, terrain condi- 
tions by placing objects, and helicopter setups by ad- 
justing the mechanisms. Because of the safety provided 
by the testbed, even potentially disastrous situations 
like the failure of critical helicopter parts can be tested. 

Using a simplified helicopter dynamics model we have 
implemented a control system capable of hovering the 
helicopter using linear controllers tuned at different op- 
erating points. This control system provides us with a 
stable platform necessary for conducting low-speed and 
hovering experiments. 

One apparent limitation of the test stand is its
inability to support larger model helicopters capable of
lifting several sensors at once. On the other hand, since
the test stand provides ground truth data, we can
simulate data from certain sensors by purposely corrupting
the stand data before using it. Different sensors can be
individually characterized by comparing their response
with ground truth data and their presence on-board the
helicopter can be simulated during experiments.

Low Latency Vision Hardware for
Helicopter Control

Our experience controlling model helicopters using the
test stand has shown the necessity of velocity and
position feedback rates of 15 to 30 Hz. Processing image
data at these rates requires fast computers capable of
acquiring and processing images at frame-rate (30 Hz).
There are a number of new cost-effective compact CPU
platforms designed for high speed data transfer and
processing. Among the most popular are: SGS-Thomson
inmos T9000 Transputer, Intel i860, and Texas Instruments
TMS320C40 Digital Signal Processor (C40). Our
development is based on the C40 platform primarily
for its high speed communication ports each capable
of transferring data at 20 MB/s. Other advantages
include: programmable Direct Memory Access (DMA)
well-suited for image windowing operations, flexible
memory architecture and internal bus structure, and
wide availability. The structure of our customized
vision processor is shown by Figure 3.

Figure 3: Vision Processor Structure

We have achieved close integration of vision with
other on-board sensors using customized hardware
designed to interface with an array of C40 processors.
This low-level integration is key in providing robust
velocity and position estimation.

Digitizer Configuration

The helicopter has multiple on-board sensors: two
ground-pointing black and white CCD video cameras,
vertical and directional gyroscopes, and accelerometers
for each translational axis. The data from these sensors
is digitized using multiple special-purpose digitizers. In
particular, our system provides variable sampling rates
for image digitization. Typically, the NTSC video signal
is sampled at 14.3 MHz which yields close to 1000 pixels
per line. Conventional video digitizers choose 512 or 640
of these pixels per line during digitization. Since most
CCD cameras have less than 1000 CCDs per line, we
directly control digitizer sampling to reduce image data
bandwidth and to provide more original image content.
The aspect ratio of the image changes with sampling
frequency and must be properly calibrated.


Convolution and Image Tagging 

Fast convolution is essential for image preprocessing. 
In addition to edge detection and smoothing, matching 
and feature extraction can be performed using special 
convolution masks. We use real-time convolution hard- 
ware to perform Gaussian smoothing before processing 
images. To reduce image data bandwidth, we subsam- 
ple the image using the digitizer before performing the 
smoothing operation. For the experiments described in 
this paper, 8x8 convolution masks were used on images 
sampled at 6 MHz pixel frequency. 

Using similar convolution hardware, accelerometer 
and gyroscope data are sampled at 120 Hz and fil- 
tered by 64x1 Gaussian FIR filters. The filtered data 
is sampled and incorporated in the image data stream 
by an image tagger. Precise temporal matching of this 
data with the image is performed by using the camera’s 
60 Hz field vertical sync clock (VSYNC) and shutter 
speed. We use 1 millisecond shutter speed for tagging 
images accurately and reducing image blurring during 
helicopter motion. 
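
The sensor filtering step can be illustrated with a minimal
sketch in Python (NumPy). It builds a 64-tap Gaussian FIR kernel
and applies it causally to a sample stream such as the 120 Hz
gyroscope data. The kernel width (sigma) and the function names
are assumptions made for illustration; the text specifies only
the tap count and the sampling rate, and the real filter runs in
convolution hardware rather than software.

```python
import numpy as np

def gaussian_fir(num_taps=64, sigma=8.0):
    """Return a normalized Gaussian FIR kernel.

    sigma (in samples) is an assumed value; the paper does not state it.
    """
    n = np.arange(num_taps) - (num_taps - 1) / 2.0  # tap positions centred on zero
    kernel = np.exp(-0.5 * (n / sigma) ** 2)
    return kernel / kernel.sum()                    # unity gain for constant input

def filter_sensor(samples, kernel):
    """Causal FIR filtering: output i depends only on samples up to index i."""
    return np.convolve(samples, kernel, mode="full")[: len(samples)]
```

Once the filter has seen 64 samples, a constant input passes
through unchanged, which is a convenient sanity check on the
normalization.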

High Speed Data Link 

Because of the camera’s VSYNC frequency, the processing
time period for the sensor-tagged field images
can only be a multiple of 16.7 milliseconds. Barely missing
an image due to long processing time is expensive
since the processor must wait for a new image for proper
synchronization. Image field digitization alone requires
16.7 milliseconds. During this time period the image
must be transferred to the processor in order to achieve
frame-rate (30 Hz) performance. We perform this transfer
through a high speed data link designed to communicate
with C40 processor comm-ports. This link incorporates
small hardware buffers to convert the incoming
synchronous image stream to the asynchronous comm-port
protocol of the C40. In addition, since the image
data is not directly entering a frame buffer, the high
speed link provides proper comm-port synchronization
with the camera using an internal state machine. The
comm-port design reduces CPU memory bus traffic by
using the C40’s internal data buses and provides the ability
to transfer only regions of interest using the C40’s versatile
DMAs. These functions are crucial in improving processor
speed.
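
The cost of barely missing a field can be made concrete with a
short Python sketch. It assumes, as a simplification, that the
processor always waits for the next VSYNC before starting on a
new image, so the update period is the processing time rounded
up to a whole number of fields. The function names are
hypothetical.

```python
import math

FIELD_PERIOD_MS = 1000.0 / 60.0  # about 16.7 ms, set by the 60 Hz VSYNC

def fields_consumed(processing_ms):
    """Whole field periods that elapse before the next image can be used."""
    return max(1, math.ceil(processing_ms / FIELD_PERIOD_MS))

def update_rate_hz(processing_ms):
    """Effective estimation rate under the wait-for-VSYNC assumption."""
    return 60.0 / fields_consumed(processing_ms)
```

Under this model, a processing time of 17 ms, only slightly
longer than one field, halves the rate from 60 Hz to 30 Hz, and
34 ms drops it to 20 Hz.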


Search Mission 

As a concrete mission for an autonomous vision-guided 
helicopter, we envision a task of locating a known ob- 
ject in a predetermined outdoor area, for example, a 
particular car in a parking lot, and tracking the object 
by controlled helicopter flight. 

The development of the indoor test stand allows us to
conveniently simulate search mission scenarios using a
variety of objects and terrain for visual tracking
experiments. By choosing these indoor experiments carefully,
we expect to achieve similar performance outdoors. The
differences in flight altitude and terrain illumination can be
resolved by small modifications to camera lenses, shutter
speeds, and digitizing hardware.

Our mission is to search for a small car stranded 
somewhere in rough terrain. Performing this task re- 
quires object recognition to find the car, and visual 
measurement of position and velocity for autonomous 
flight. We have covered the stand base with gravel col- 
lected from the outdoor mission site to provide a real- 
istic scene for our vision algorithms. 

Velocity and Position Measurement 

To measure helicopter velocity or position based on image
data, we must first determine the displacement between
consecutive images. This displacement in camera
pixel coordinates is a function of camera attitude
and distance relative to objects in the scene, and of
camera calibration parameters such as focal length. For
the indoor search experiments, camera attitude is
estimated by gyroscopes and camera distance from the
ground is estimated using the test stand. Performing
outdoor experiments without the test stand requires
altitude measurement by stereo vision, possibly integrated
with a laser rangefinder or microwave radar system.

The apparent displacement between consecutive images
is a result of camera translation and rotation.
Disambiguating rotation from translation is especially
important for helicopter control since helicopter translation
is directly a result of its change in attitude. Figure 4
shows the significance of this effect while the
helicopter flares for reducing forward speed or stopping.

Figure 4: Effect of helicopter rotation

By carefully measuring the angular change between 
templates and images, we can estimate the effect of ro- 
tation and correct the image displacement to only re- 
flect translational motion. This correction is useless 
without precise synchronization of gyroscope data with 
images. The drift common to all gyroscopes is not a 
problem here, since only the change in attitude is nec- 
essary from frame to frame. 

Image Displacement Measurement 

We use template matching to measure the displacement
between consecutive images, with the sum of squared
differences (SSD) as our matching criterion. Each template
is an m x n window of image intensities selected
from the previous image. The best match of the template
in the image can be determined by minimizing the
SSD of the template and image pixels. To reduce the
amount of computation, we restrict our search area to a 
small window around the template’s neighboring pixels. 
The size of this search area is determined by helicopter 
altitude and anticipated worst case change in helicopter 
motion within one frame period. As the helicopter al- 
titude decreases, the same translational motion causes 
a larger displacement in the image. The minimum al- 
titude of the test stand is 1 meter and the on-board 
camera lens has 7.8 mm focal length. If we allow max- 
imum helicopter velocity to be 1.5 meters per second 
during hover, our maximum image template displace- 
ment is 32 pixels per frame. 
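
The windowed SSD search described above can be sketched in
Python (NumPy) as follows. This is an exhaustive software search
for illustration only, not the C40 implementation. The function
name, the argument layout, and the choice to return the whole SSD
surface are assumptions.

```python
import numpy as np

def ssd_match(image, template, top, left, radius):
    """Find the displacement of `template` inside `image` by minimizing SSD.

    (top, left) is the template's position in the previous image and
    `radius` is the largest displacement searched in each direction.
    Returns (dy, dx, ssd_surface).
    """
    th, tw = template.shape
    t = template.astype(np.float64)
    size = 2 * radius + 1
    ssd = np.full((size, size), np.inf)  # inf marks candidates outside the image
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + th > image.shape[0] or c + tw > image.shape[1]:
                continue
            diff = image[r:r + th, c:c + tw].astype(np.float64) - t
            ssd[dy + radius, dx + radius] = np.sum(diff * diff)
    iy, ix = np.unravel_index(np.argmin(ssd), ssd.shape)
    return iy - radius, ix - radius, ssd
```

The number of candidates grows with the square of the search
radius, which is why restricting the window to the anticipated
worst-case motion matters for frame-rate operation.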

A coarse-to-fine strategy further reduces the search area
and improves speed. We begin by using every fourth pixel to
produce a coarse match for narrowing the search to 64
possible pixel locations. This estimate is improved to
subpixel accuracy by fitting a parabolic surface to the
SSD of the 64 match candidates. Figure 5 shows an
example of a fitted parabola. A good parabola fit will
refine the best single-pixel match within ±1 pixel. The
parabola minimum is disregarded if it is not within one
pixel of the single-pixel match.
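
A minimal Python (NumPy) sketch of the subpixel refinement
follows. It fits a general quadratic surface to the SSD values
by least squares and rejects the fitted minimum when it lies more
than one pixel from the best integer match. The exact surface
model, the 8 x 8 layout of the 64 candidates used in testing, and
the function names are assumptions; the text states only that a
parabolic surface is fitted to the 64 candidates.

```python
import numpy as np

def fit_quadratic_minimum(ssd, coords):
    """Fit z = a x^2 + b y^2 + c x y + d x + e y + g; return its minimum or None.

    ssd:    (N,) SSD values at the candidate positions
    coords: (N, 2) candidate positions as (x, y) pixel offsets
    """
    x = coords[:, 0].astype(float)
    y = coords[:, 1].astype(float)
    A = np.column_stack([x * x, y * y, x * y, x, y, np.ones_like(x)])
    a, b, c, d, e, _ = np.linalg.lstsq(A, np.asarray(ssd, float), rcond=None)[0]
    # The surface has a unique minimum only if its Hessian is positive definite.
    if a <= 0 or 4 * a * b - c * c <= 0:
        return None
    hessian = np.array([[2 * a, c], [c, 2 * b]])
    return np.linalg.solve(hessian, np.array([-d, -e]))

def refine_match(ssd, coords):
    """Return the subpixel minimum, or the best integer match if the fit is poor."""
    best = coords[np.argmin(ssd)].astype(float)
    fit = fit_quadratic_minimum(ssd, coords)
    if fit is None or np.any(np.abs(fit - best) > 1.0):
        return tuple(best)
    return tuple(fit)
```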
Figure 5: SSD Parabola Fit


In addition to subpixel accuracy, the fitted parabola 
provides match uncertainty information. A steep 
parabola versus a shallower one signals a more accurate 
match. Covariance matrices constructed from parabola 
coefficients will allow us to combine data from each tem- 
plate using a Kalman filter to produce the best estimate 
of image displacement. 
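
One simple way to combine per-template displacements is
inverse-covariance weighting, which coincides with a Kalman
measurement update for a constant displacement with no prior.
The Python (NumPy) sketch below illustrates this under stated
assumptions: each template supplies a 2-vector displacement and a
2 x 2 covariance, and how that covariance is derived from the
parabola coefficients is not specified in the text and is left to
the caller. It is not the filter described in the paper.

```python
import numpy as np

def fuse_displacements(estimates, covariances):
    """Combine 2-D displacement estimates weighted by their inverse covariances.

    estimates:   iterable of length-2 arrays, one per template
    covariances: iterable of 2x2 covariance matrices (assumed given)
    Returns (fused_estimate, fused_covariance).
    """
    information = np.zeros((2, 2))
    weighted_sum = np.zeros(2)
    for z, P in zip(estimates, covariances):
        W = np.linalg.inv(np.asarray(P, float))  # a steep parabola means large W
        information += W
        weighted_sum += W @ np.asarray(z, float)
    fused_cov = np.linalg.inv(information)
    return fused_cov @ weighted_sum, fused_cov
```

A template with a shallow SSD parabola has a large covariance and
therefore contributes little to the fused estimate.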

For experiments reported here, we use four image 
templates for velocity and one template for position es- 
timation as shown by Figure 6. The four velocity tem- 
plates are 40 x 40 pixels in size and are positioned in 
each image quadrant. After each matching operation, 
the displacement of each template is calculated and the 
templates are updated with new image data from the 
same location. 

Actual velocity measurement during flight is shown 
by Figure 7. This figure compares ground truth lateral 
and longitudinal velocity measurement (solid line) from 
the test stand with vision-based velocity estimates. The 
dashed and dotted lines in each figure represent vision- 
based velocity estimates with and without attitude cor- 
rection. The correction was performed by measuring 
the attitude change between each template-image pair. 
Assuming images are taken from a locally flat surface, 
we can construct a transform, based on helicopter alti- 
tude and camera focal length, to convert the attitude 
change to a correction vector on the image plane for 
each template location. The effect of this correction is
significant: 33 cm/s RMS error in lateral velocity
measurement without attitude correction versus 5 cm/s
after correction.
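
The attitude correction can be sketched in Python (NumPy) using
the standard small-rotation image motion equations for a pinhole
camera. The sketch assumes a camera pointing straight down at
flat ground, so that every template sees the same depth (the
altitude), a focal length expressed in pixels, and image
coordinates measured from the principal point. The axis
conventions and function names are assumptions; the text does not
give the transform explicitly.

```python
import numpy as np

def rotation_shift(x, y, f, d_rot):
    """Pixel shift at image point (x, y) caused by a small camera rotation.

    d_rot = (rx, ry, rz) is the attitude change in radians about the camera
    X (right), Y (down) and Z (optical) axes between the two images.
    The shift does not depend on scene depth.
    """
    rx, ry, rz = d_rot
    u = (x * y / f) * rx - (f + x * x / f) * ry + y * rz
    v = (f + y * y / f) * rx - (x * y / f) * ry - x * rz
    return np.array([u, v])

def translational_velocity(disp, x, y, f, d_rot, altitude, dt):
    """Camera velocity parallel to the ground (m/s) from a template displacement.

    disp is the measured displacement in pixels over dt seconds.
    The scene moves opposite to the camera, hence the sign change.
    """
    corrected = np.asarray(disp, float) - rotation_shift(x, y, f, d_rot)
    return -corrected * altitude / (f * dt)
```

With a focal length of 500 pixels, for example, a rotation of
0.01 rad about the Y axis shifts the image center by 5 pixels,
which without correction would be misread as translation.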

The position estimation template is 64 x 64 pixels
in size and its location varies as the helicopter moves.
This template is updated with image data from the best
match in order to compensate for changes in helicopter
altitude and heading. If the best template match is
close to leaving the camera view, the position template
is loaded from the image center. A larger search area of
64 pixel displacement is used due to longer processing
time. Figure 8 shows vision-based (dashed line) and
ground truth (solid line) lateral position with respect
to camera starting point. Attitude correction is more
complicated in this case since the template changes
position in the image plane. The figure shows uncorrected
position estimation.

Figure 6: Image Templates

Figure 7: Vision-Based Velocity Measurement

The solid lines represent ground truth helicopter velocity from
the test stand. The dotted lines show velocity based on image
displacement alone and the dashed lines represent vision-based
velocity with attitude correction.

Figure 8: Position Measurement

The solid line represents ground truth helicopter position from
the test stand. The dotted line shows position based on image
displacement.

Position and Velocity Data Flow and Synchro- 
nization 

We cannot overemphasize the role of accurate synchronization
in integrating on-board sensor data with
high speed image processing. As observed above, attitude
correction by synchronizing image and gyroscope
data produces a significant improvement in position
and velocity measurement accuracy. Figure 9 shows
the data flow and synchronization we perform
for the helicopter motion estimation described above.

The solid vertical lines represent the camera VSYNC
from the second image field (B). For high speed
performance, only one image field (A) is used for motion
estimation. The process begins with opening the
camera shutter for 1 millisecond prior to VSYNC.
Filtered gyroscope and accelerometer data is sampled with
VSYNC and included in the image data stream by the
image tagger. The tagged image is transferred to C40-1
which partitions field A for other C40s. The top half
of field A is used by C40-1 and the bottom half by
C40-2 for velocity estimation. In addition, field A is
transferred to C40-3 for position estimation. Due to
the high bandwidth of connections between C40s, it is
possible to start image processing during image transfer.
The transferring is performed by DMAs which do
not interfere with data processing. C40-1 also transfers
synchronized gyroscope and accelerometer data to
C40-4 which is responsible for state estimation and control.
The state estimation is performed by transforming
image displacement data from other C40s to helicopter
translational motion. The estimated translational velocity
and position in conjunction with accelerometer
and gyroscope data are used by linear control loops to
control the helicopter.

Figure 9: Data Flow and Synchronization

Object Search 

The vision-based velocity and position estimation pro- 
vides the basic capability for hovering and low-speed 
flight necessary for our indoor search mission. Locating 
the object of interest is the next step. We use template 
matching to perform this search. A major difficulty 
in this approach is that object orientation is unknown. 
This requires templates of the object in all possible
orientations for matching. Methods such as K-L expansion [18]
can be used to reduce the computational complexity
and storage of the necessary templates. Another problem
stems from varying helicopter altitude, which
changes the size of the object in the image. Close regulation
and measurement of helicopter altitude is necessary
to further reduce the complexity of the search.

We are conducting the search using a set of twenty
32 x 32 templates. These twenty templates, generated
by K-L transform techniques, are sufficient for locating
objects over a ±40° range of orientation with a
resolution of one degree. The processing frequency for
searching the entire image is 3 Hz using one C40 processor.
Upon locating the object, the position estimator
can use the object in the image as its template,
providing the relative helicopter position necessary for object
tracking.
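
The idea of compressing many rotated views into a small set of
templates can be sketched in Python (NumPy) with a K-L
(principal component) expansion. The sketch generates views of a
32 x 32 patch at one-degree steps over ±40°, keeps the twenty
leading eigen-templates, and estimates the orientation of a query
patch by nearest neighbour in coefficient space. The
nearest-neighbour rotation, the matching rule, and all function
names are assumptions for illustration; the text does not
describe how the K-L templates are applied during the image
search.

```python
import numpy as np

def rotate_nearest(img, angle_deg):
    """Rotate a square image about its centre using nearest-neighbour sampling."""
    n = img.shape[0]
    c = (n - 1) / 2.0
    t = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:n, 0:n]
    # Inverse mapping: source coordinates for every output pixel.
    sx = np.cos(t) * (xs - c) + np.sin(t) * (ys - c) + c
    sy = -np.sin(t) * (xs - c) + np.cos(t) * (ys - c) + c
    sx = np.clip(np.rint(sx).astype(int), 0, n - 1)
    sy = np.clip(np.rint(sy).astype(int), 0, n - 1)
    return img[sy, sx]

def build_kl_basis(views, k):
    """views: (num_views, num_pixels). Returns mean, (k, num_pixels) basis, coefficients."""
    mean = views.mean(axis=0)
    centred = views - mean
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    basis = vt[:k]                      # k leading eigen-templates (orthonormal rows)
    return mean, basis, centred @ basis.T

def estimate_angle(patch, angles, mean, basis, coeffs):
    """Project a patch onto the basis and return the angle of the closest view."""
    q = (patch.ravel() - mean) @ basis.T
    return angles[np.argmin(np.sum((coeffs - q) ** 2, axis=1))]
```

Each candidate image window then needs only twenty inner products
instead of a comparison against all 81 rotated views.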
Conclusions

We have successfully developed the key components
necessary for vision-guided autonomous flight. As our
experimental results demonstrate, we are achieving
real-time, low-latency image processing at rates suitable
for flying helicopters stably. The major elements in our
development have been custom-designed vision hardware
and an indoor testbed. In addition to high speed processing,
the customized hardware provides flexible integration
of on-board sensors, which significantly improves
vision-based state estimation. The indoor testbed provides
convenient calibrated experimentation, which is essential
in building real autonomous systems.